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Characterization and Control of Diffusion Processes 

in Multi-Agent Networks 
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Abstract —Diffusion processes are instrumental to describe the 
movement of a continuous quantity in a generic network of 
interacting agents. Here, we present a probabilistic framework for 
diffusion in networks and propose to classify agent interactions 
according to two protocols where the total network quantity is 
conserved or variable. For both protocols, our focus is on asym¬ 
metric interactions between agents involving directed graphs. 
Specifically, we define how the dynamics of conservative and non¬ 
conservative networks relate to the weighted in-degree Laplacian 
and the weighted out-degree Laplacian. Our framework allows 
the addition and subtraction of the considered quantity to and 
from a set of nodes. This enables the modeling of stubborn 
agents with time-invariant quantities, and the process of dynamic 
learning. We highlight several stability and convergence charac¬ 
teristics of our framework, and define the conditions under which 
asymptotic convergence is guaranteed when the network topology 
is variable. In addition, we indicate how our framework accom¬ 
modates external network control and targeted network design. 
We show how network diffusion can be externally manipulated 
by applying time-varying input functions at individual nodes. 
Desirable network structures can also be constructed by adjusting 
the dominant diffusion modes. To this purpose, we propose a 
Markov decision process that learns these network adjustments 
through a reinforcement learning algorithm, suitable for large 
networks. The presented network control and design schemes 
enable flow modifications that allow the alteration of the dynamic 
and stationary behavior of the network in conservative and non¬ 
conservative networks. 

Keywords —multi-agent networks, diffusion process, directed 
graphs, graph Laplacian, network dynamics, network control, 
network design 


I. Introduction 

Large-scale network dynamics received ample research in¬ 
terest over the last decade in the context of group coordina¬ 
tion 0> distributed algorithms |2J, |3j, network control & 
distributed optimization jl5|, consensus problems 0, or 
herding and flocking behavior [8). Network dynamics involve 
interactions between individual agents and relate to the diffu¬ 
sion of a continuous quantity within a generic network [9|- 
(BJ. including assets among financial institutions and opin¬ 
ions within social networks. In this work, we establish a 
probabilistic diffusion framework that describes in continuous 
time the movement of such a continuous quantity or node 
property within a multi-agent network. The main contributions 
of our framework are (i) to classify linear update rules and 
characterize the corresponding dynamical network behavior, 
(ii) to include external inputs as control variables into the 
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framework resulting in the study of the network stability and 
convergence under different conditions, and (iii) to impose 
flow modifications by means of network control and network 
design. 


This framework builds on consensus models involving 
Markovian state transitions that have been formulated in dis¬ 
crete time 0, |T6)-© and continuous time (21), as well as 
multi-agent gossiping models describing interactions between 
pairs of agents ©• We generalize these models and introduce 
a classification of linear inter-agent update rules depending on 
whether the global quantity initially present in the network is 
conserved or not. This enables our framework to account for a 
wider range of network phenomena. For example, financial or 
trade assets could be modeled using conservative flows, while 
opinions follow non-conservative network dynamics. Going 
beyond symmetric, unweighted graphs [22], we focus on 
weighted graphs with asymmetric update rules and derive the 
corresponding differential equation that describes the diffusion 
of the considered quantity averaged over all sample paths. The 
nature of the update rule affects the stability and convergence 
of diffusion processes over networks 0 - We highlight the 
differences in the transient and stationary characteristics of 
conservative and non-conservative networks, the effects of 
network asymmetry, as well as the conditions for convergence 
in networks with switching topologies. 


Building on existing leader-follower models 1231, p4) , we 
extend the homogeneous differential equation that describes 
the diffusion process to its inhomogeneous form and use the 
inhomogeneous term to model an exogenous quantity input. 
By doing so, we can model the addition and subtraction of the 
considered quantity to and from the multi-agent network. Our 
framework unifies several types of systems involving external 
influences, including external restrictions on the quantity val¬ 
ues of selected agents, as well as externally imposed reference 
values obtained from measurements or predefined system 
targets. For instance, we demonstrate that diffusion processes 
considered by other authors, such as the presence of stubborn 
agents G3 or the process of dynamic learning (19) , can be 
expressed in terms of the inhomogeneous differential equation. 
In addition, we demonstrate that under some constraints on 
the input vector and the network topology, networks with 
exogenous excitations can remain stable and converge to a 
steady state. 


Our framework enables network control through the in¬ 
troduction of external excitations at individual nodes or the 
adjustment of the network structure. By transforming the 
basis of agents using standard results in state-space control 
to isolate the time dynamics of the characteristic modes of the 
system from one another |25|, and implementing a suitable 
controller for one of the modes, we can design time-varying 
input functions for each of the individual agents. A suitable 
choice of the controller will allow the network designer to 




2 


guide the system towards a desired objective in terms of 
diffusion control. Different types of networks are associated 
with different modal compositions and require different input 
design considerations for the achievement of control. Besides 
externally imposing control on a network, we also address 
the modification of inter-agent interactions to impose network 
control. By modifying the transition rate matrix that governs 
the network dynamics, the eigenvalues of the characteristic 
modes of the system are shifted, and the dynamics of the 
system are modified. Furthermore, we present an adaptive 
heuristic that models the modification of the network structure 
as a Markov decision process (MDP). In this MDP, the state 
space is defined as the set of individual agents, while the action 
space is defined as perturbations to the transition rate matrix 
governing the system evolution. Using an appropriate reward 
function and a suitable time schedule for decision making, 
the action space can be searched efficiently using a reinforce¬ 
ment learning algorithm based on standard learning techniques 
[ |26] , and suboptimal control with quick convergence can be 
achieved. By enabling each agent to be associated with a cost 
or a reward, we allow the isolation and control of individual 
nodes through this modification. In conclusion, the proposed 
network control and network design processes form a toolbox 
for the micromanagement of diffusion in networks. 

The remainder of the paper is structured as follows. In Sec¬ 
tion [!I] we introduce two essential classes of stochastic update 
rules that are able to model diffusion in networks, and derive 
the corresponding linear system of ordinary differential equa¬ 
tions that describe the network dynamics for the corresponding 
conservative and non-conservative networks. In Section [HI] 
we further discuss the stability and convergence characteristics 
of these differential equations. In Section |IV| we extend the 
homogeneous equations to their inhomogeneous forms, and 
discuss their stability and convergence characteristics. In Sec- 
tion[VJ we look at how external excitation can possibly achieve 
control, and in Section VI we discuss the impact and possible 
implementations of network structure modification. In Sections 
0 and VlJ we also provide case studies to illustrate how these 
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control processes can be implemented. In Section 
conclude our work by looking at some future directions and 
extensions. 


II. System model and stochastic update rules 


We consider a population V of interacting agents Vi , where 
i £ X = {1,2,... ,n : n £ Z + }. These agents have the 
capacity to generate, destroy, store, transmit and receive a 
continuous quantity or node property Si(t) £ R that can 
change with time t > 0. We denote the vector of properties by 
S(t) = {Si(t) : i £ I}. Given some set of initial conditions 
iS. ( (0) = Si t o for some Si t o £ R, as well as probabilistic 
interactions between the agents, the system evolves over time 
and the property S, (t) of each agent can be monitored over 
time. The probabilistic interactions between the agents can be 
described by a weighted digraph G = ( V,£,w) where V is 
the set of agents, £ is the set of directed links between pairs of 
agents in V, and w: £ —> R + is a function that represents the 
weights. The weighted adjacency matrix can be represented as 




w(i,j) if (i,j) £ £ , 

0 otherwise . 


( 1 ) 


Furthermore, we define the weighted in-degree matrix Dq ^ 
and the weighted out-degree matrix Dq U ^ as the diagonal 
matrices with 

= '52 A G( i ,j ), ( 2 ) 

and 

D G Ut \i,i) = ^2 Aa(i,j) ■ (3) 

3 

The exact meaning of the edge weight and the direction of 
the links will be made explicit when we analyze the update 
rules, referred to as protocols, in the following Section. Since 
the interactions between agents can be asymmetric, we will 
introduce two different Laplacians that refer to the in-degree 
and the out-degree of each node. We define the weighted in¬ 
degree Laplacian as 

lP*\G) = D™ - Ac . (4) 

Conversely, the weighted out-degree Laplacian is defined as 

L (out) (G) =D < g ut) - A g . (5) 

Depending on the choice of the stochastic update rule used 
in the probabilistic interactions between the agents, we will 
characterize the flow dynamics of networks operating under 
different protocols. Here, we describe two main classes of 
linear update rules that result in linear, time-invariant differen¬ 
tial equations in the node property and corresponding linear, 
time-invariant matrix differential equations in the diffusion 
probabilities [[] These update rules are distinguished by the 
conservation or the non-conservation of the total property 
initially present in the network. The networks applying the 
conservative and non-conservative protocols will be referred 
to as conservative and non-conservative networks respectively. 

A. Conserx’ative networks 

We first consider a protocol where the total amount of 
property in the network is conserved at every instant in 
time. Conservative updating is relevant for the description of 
conservative flow dynamics in a network, including the flow 
of material and physical assets. In this respect, conservative 
networks are able to represent stylized instances of hydraulic, 
financial, or trade networks. 

In conservative networks, agents obey the conservative up¬ 
date rule 


Si(t + At) — Si(t) + CijSj(t) 

S j (t + At) = (l-C ij )S j (t), (PI) 

where i,j £ I, ( i,j) £ £. The parameter Cij £ (0,1] is a 
measure of liability or responsibility of agent j towards agent 
i, and At is an infinitesimal time interval. For every edge 
[i, j) £ £, there exists a clock that follows an independent 
Poisson process with rate ry > 0. The protocol < | PI [ > is 
executed for nodes i and j when the independent Poisson clock 
of (i, j) ticks. The following Lemma characterizes the property 
dynamics of the instance-averaged value of S in conservative 
networks. 


^or non-linear update rules, we refer to l 27 
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Lemma 1. Let S(t) denote the expected value of S averaged 
over all sample paths. The dynamics of the expected property 
for a network applying <ED are defined by 

5(f) = QS(t), (A) 

where Q = —7/ ln ) (G), the weighing function is defined as 
w(i,j) = Cijrij, and 


Q 


ij 


Gij t'ij if i f j , 

if i = j . 


( 6 ) 


Proof: We first note that the total update rate for an node 
* € V is given by r\ = )T/ rij , and that the total update rate 
of the network is given by r = r,. Assume that a global 
network clock is ticking at rate r. Then, the probability that the 
clock will activate edge (i. j) is given by r^/r, where in the 
limit of large-scale networks r ~ 1/A t. Consequently, when 
property updating follows the probabilistic update of the 
property of the nodes corresponding to the edge (i. j) is given 
by 


Sfit + At) - Si(t) = ^2 r tj AtCij Sj(t) (7) 

i/i 

Sj(t + At) - Sj(t) = -rij AtCij Sj(t ), (8) 

which can be succinctly written as 

Si(t + At) -5;(f) = At(rijCijSj(t) -rjiCjiSi(t)). (9) 


Dividing by At and taking the limit for At —t 0, we get a 
system of differential equations 

Si(t) = ^ rijCijSjf) - rjiCjiSfit) , (10) 

or written differently 

5(f) = -LW(G)S(t), (11) 

which concludes the proof. Q 

We will refer to (|A]l as the governing equation. 

Corollary 1. The matrix Q = - lJ'" > (G) of a conservative 
network represents the transition rate matrix of a continuous¬ 
time Markov chain (CTMC). The transition probabilities Vij(t) 
of the CTMC are solutions of the following differential equa¬ 
tion 

V(t) = QV(t),V(t) = [P ij (t)]. (12) 

Proof: From ([6]), we notice that the 

Q,., ■- - v g„ V j'6l, (13) 


such that the columns of Q sum to zero. It follows that Q 
represents the transition rate matrix of a positive recurrent 
CTMC. Accordingly, the spread of property from one agent 
to another can be described by a CTMC whose state space is 
equivalent to the agent set V and whose instantaneous state 
is denoted by X(t). Given that the diffusion dynamics can be 
described by a CTMC, we can write {28 ] 

Pr[X(t + u) = i] = yy Vij(u) Pr[X(t) = j] ,V u€(0, oo) 
j 


for some state »gl, where 

Vij{u) = Pr[X(t + «) = i\X(t) = j] (15) 

describes the probability that an infinitesimal piece of property 
is in state i (i.e. agent i) at time t + u, given that it was in 
state j (i.e. agent j ) at time t. From ( fl4| , it follows that the 
transition probability matrix V = [Pij\ satisfies 

V(t + u) =V(t)V(u) V f,u€ (0, oo), (16) 

which is known as the Chapman-Kolmogorov equation. As¬ 
suming the limits lim^o V{h) = 'P(O) and lim^o V(h) = 
P(0) exist, we can then adopt the relation Vlff) = I where I is 
the identity matrix, and thereafter the following deterministic 
differential equation 

V(t) = QVit) (17) 

where Q = V(0) = [Qij\. □ 

We can interpret ( [12| ) as an equation describing the diffusion 
of transition probabilities between the various agents over time. 

Note that Lemma [T] can also be demonstrated via a proba¬ 
bilistic interpretation of the CTMC. It can be shown that V(t) 
is continuous in t, Vf > 0 | [28] , For a conservative network, 
it follows from ( fi~3j ) that the columns of 'P(t') always sum to 
zero. This shows that for each origin agent j , the sum over 
all the agents i of the time-cumulative transition probabilities 
Vi j is conserved at a value of 1 over time, as expected 
from conventional laws of probability. We now consider the 
expected value of S, averaged over all sample paths, which is 
given by the column vector 

S(t) = V(t)S( 0). (18) 

This illustrates that the sum of S over all agents remains a 
conserved quantity when © is applied. If the system starts 
from a known, deterministic state, then the bar in 5(0) can be 
omitted. Combining ( fl2| and ( | 1 8j >, we find again 


B. Non-conservative networks 

We now consider a protocol where property diffuses be¬ 
tween agents by means of a convex updating protocol. This 
protocol is of interest for opinion dynamics 0, |jg, or 
preference dynamics in cultural theory |[29 |. 

In non-conservative networks, agents obey the following 
convex update rule |[T5j| 

Si(t + At) = CijSjif) + (1 — Cij)Si(t) , (P2) 

whenever the Poisson clock ticks for a pair of agents %. j £ I, 
(i,j) £ £■ I n other words, when the (*,j)-th Poisson clock 
activates the link between agents i and j, agent i is triggered 
to poll the property value of agent j with a confidence 
measure of Cy and update its own value accordingly. The 
following Lemma characterizes the property dynamics in non¬ 
conservative networks. 


Lemma 2. The dynamics of the expected property for a 
network applying are defined by the governing equation 

S(t) = QS(t), (A) 

where Q = ~L^ out \G), w(i,j) = CijTij, and 


Q 


ij 


C'ijt’ij 


n. .r-.. 


if 


(14) 


(19) 
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Proof: When property updating follows ( |P2[ >, the proba¬ 
bilistic update of the property of node Vi £ V is given by 

Si(t + At) - Si(t) = y^X r ijAt)Cij(Sj(i) - Si(t)). (20) 

Dividing by Ai and taking the limit for At —y 0, we obtain 
the instance-averaged linear differential equations represented 
by <jA} with Q = -L^ out fG). O 

Note that Lemma [2] extends the basic consensus algorithm 
where S, (t) = (<%(£) — <Sj(t)), with J\f t the neighbor¬ 

hood of i, to asymmetrically weighted updating. The relevance 
of the asymmetry will be further discussed in Section III. 

Corollary 2. The matrix Q = (Cl) of a non- 

consen’ative network represents the transition rate matrix of 
a CTMC, and the transition probabilities of the CTMC are 
solutions of the following differential equation 

V(t) = QP(t ), (21) 

Proof: The proof is similar to the proof of Corollary 1, 
and follows from the fact that 

Qu = ~J2 III. * V ' eI - ( 22 ) 

meaning that the rows of Q all sum to zero. The CTMC 
corresponding to Q represents a random walk on G. Q 

In comparison with the transfer of an infinitesimal piece of 
property in a conservative network, P tJ in non-conservative 
networks represents the probability that node i polls node j 
to update its own value using the weight C,; 7 . For networks 
applying ( |P2| >, we note similarly that the rows of 'P(t) always 
sum to zero, and consequently the rows of V(t) always sum 
to one. This shows that every polling tag originating from an 
agent i must either still be under the possession of agent i or 
of some other agent j in the network. 

Remark 1. Lemma [7] and Lemma [2] make the link between 
the protocols 0 and © and the transition rate matrices 
explicit. In the case of a symmetric Q matrix, the in-degree and 
out-degree Laplacians are identical. Hence, the conservative 
and non-conservative protocols are equivalent for symmetric 
matrices. In the event of asymmetry, we observe from Q and 
0 that the consen’ative Q and the transpose of the non¬ 
conservative Q are identical if the tw’o corresponding digraphs 
have the same weights w but with all link directions reversed. 

III. Homogeneous equation: Stability and 

CONVERGENCE 

In this Section, we analyze the steady-state characteristics 
of the homogeneous equation ([A} based on the eigendecom- 
position of the rate transition matrix Q. The relevance of Q 
can further be understood by considering that ([12} and 0 
can be solved as V(t) = exp(Qf), indicating the role of 
the topology, the inter-agent measures of confidence, and the 
updating rates in the diffusion process. Due to the construction 
of Q as a Laplacian matrix, Q always has the eigenvalue 
q s = 0. Moreover, Q is a Metzler matrix, such that exp(Qt) 
is nonnegative for t > 0. If Q is diagonalizable, then we can 
write 

S(t) = exp (AAA _1 f) 5(0) 

= Adiag(exp(g fe f)) A _1 5(0), (23) 


where A contains the unit right eigenvectors of Q as columns, 
A -1 contains the corresponding left eigenvectors of Q as rows, 
gfc represents the eigenvalues of Q, and A = diag(r/y, : ). We can 
further write 

•5(f) = X]expfei)vR,fcWL*,fc^(0) 

k 

= Ck exp(gfcf)^R,fc , (24) 

k 

where and ig,-,k are the unit right and corresponding 
left eigenvector^] of Q in column form, and c fe = v l 
are scalars. We refer to the eigenvectors of Q as the system 
modes. From 0 the eigenvalues and eigenvectors reflect the 
characteristic growth and decay rates of the system, as well 
as the dominant and weak nodes in the network. In particular, 
the quantities of the network nodes are stable and converge to 
a steady-state distribution in finite time iff the eigenvalues of 
Q are nonpositive and include zero. Note also that the steady- 
state behavior of the nodes is given by lim^oo S(t) = c s ur jS , 
where c s and wr iS are the scalar and the unit right eigenvector 
corresponding to q s = 0. As per Gersgorin’s circle theorem, 
transition rate matrices Q constructed according to or 
( |P2} and associated with CTMCs satisfy these properties. In 
the following subsections, we will illustrate the dynamics for 
different strongly connected network classes by means of the 
network presented in Fig. [I] 

a. a. a a 

Fig. 1: Path graph P 5 with symmetric links of unit weights used to illustrate 
the transient and steady-state behaviors of the various update rules. 


A. Conservative asymmetric networks 

For conservative networks, we formulate the following 
Lemma for the stationary behavior. 

Lemma 3. The stationary value of a strongly connected 
asymmetric network applying 0 is given by the unit steady- 
state right eigenvector vr, jS scaled by c s = ‘S'i.o/^ where 

Tf is the sum of the entries o/vr )S . 

Proof: In networks that apply ([FT}, the column-sum of 
Q = — is zero, such that Q has a unit steady-state 

left eigenvector ul, s with equal components. In an asymmetric 
network, the stationary node values are in general imbalanced 
due to the unequal components in the unit steady-state right 
eigenvector vr jS . Considering ([24}, the proof is concluded. □ 

In the case of asymmetric liabilities between nodes, the 
stationary distribution will favor attractive nodes and shun 
repulsive nodes, and uniform spreading of the property over 
all the nodes does not occur in general. This is a consequence 
of the conservation of total property after every update. We 
illustrate^ this for the network in Fig. [T] with a = 0.2 in the 
plot of Sift) in Fig. [ 2 ] where we observe that the steady-state 
properties for each node are different. In the latter plot, we 
generate Si(t) by both empirically generating sample paths 
based on the update rule ( |P1}, and analytically determining 
the expected property from ([24}. 

2 The left eigenvectors ^ are only unit in the case where Q is symmetric. 
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Fig. 2: Comparison of instance-averaged node property obtained through 
Monte Carlo simulations (dotted lines) and expected property obtained by 
eigendecomposition of the matrix Q (solid lines) for the asymmetric network 
in Fig. [T] with a = 0.2 under ED For the simulations, 5000 trials were 
performed, and each timestep was discretized into 1000 sub-units. In both 
cases, the initial conditions S (0) = [00001] were adopted. 


B. Non-conservative asymmetric networks 

For networks that apply the convex update rule ( |P2| >, we 
formulate the following Lemma for the stationary behavior. 

Lemma 4. A strongly connected asymmetric network follow¬ 
ing the convex update rule ( |P2| > always achieves consensus. 
The final consensus value c v = c s /y/n in the infinite time 
horizon is the weighted average of S( 0), where the weights 
are the corresponding entries of the unit steady-state left 
eigenvector scaled by the factor I/O, where Q is the 

sum of the entries in ul, s - In other words, 

Cv = ^< B S(0) ■ (25) 

Proof: Since the row-sum of Q = —L^ out \G) is 
zero, Q has a right unit eigenvector with equal components, 
demonstrating that strongly connected networks with possibly 
asymmetric updating always achieve agreement. In order to 
determine the final consensus value, we need to determine 
for each node the constant c s that corresponds to the initial 
conditions enforced on the system. This involves finding the 
row of A~ x corresponding with q s = 0. For an asymmetric 
matrix, this corresponds to the unit steady-state left eigenvector 
of Q scaled by the factor -fifi/il, since the dot product of this 
vector and the unit steady-state right eigenvector must equal 1 
based on the initial conditions V(0) = I = A~ x A. The result 
is then scaled by the magnitude of the entries of r>R iS to obtain 
c v . n. 

We illustrate this for the network in Fig. [T] with a = 0.2 
in the plot of <S,(f) in Fig. [T] In the latter plot, we generate 
iSj(f) by both empirically generating sample paths based on 
the update rule (]P2| >, and analytically determining the expected 
property from \24\ . In this case, the final consensus value is 
heavily weighted towards node 5, whose inward link weight 
exceeds its outward link weight. This means that node 5 is 
heavily polled by its neighbors, resulting in its higher weight 
in the final consensus value. Such a trend is reminiscent of the 
availability heuristic, or availability bias, where subjects that 
are encountered or recalled more often are given more thought 
and emphasis ]30). 



Time 

Fig. 3: Comparison of instance-averaged property obtained numerically (dotted 
lines) and analytically (solid lines) for the asymmetric network in Fig. 
[T| with a = 0.2 under |P2) . The simulation parameters are identical 
to those in Fig. [2] In this case, the unit steady-state left eigenvector is 
[0.0016 0.0078 0.0392 0.1960 0.9798] and the unit steady-state right eigen¬ 
vector is -^=[11111], which gives us = 1.2244 and the corresponding 
consensus value c v = 0.8003. 



Fig. 4: Comparison of instance-averaged property obtained numerically (dotted 
lines) and analytically (solid lines) corresponding to the symmetric network 
in Fig. n\ with a = 1. The simulation parameters are identical to those in Fig. 

0 


C. Symmetric networks 

In a strongly connected symmetric network, the steady-state 
right eigenvector has components of equal magnitude in all 
dimensions regardless of the update rule. We can interpret 
this as an equal sharing of resources in conservative networks 
and consensus among all agents at a value corresponding to 
the average of the initial conditions at all the nodes in non¬ 
conservative networks GD, ED-ODD- We illustrate this for 
the network in Fig. [l] with a = 1 in the plot of Sift.) in Fig. [Tj 
In the latter plot, we generate Si(t) by both empirically gen¬ 
erating sample paths based on the update rules ( |P1| ) and ([P2]», 
which are equivalent in this case, and analytically determining 
the expected property from ( |24| . 

D. Switching topologies 

There are many important scenarios where the network 
topology can change over time. As an example, connections 
in wireless sensor networks can be established and broken due 
to mobility. In this Section, we study the steady-state behavior 
of networks with switching topologies and provide a condition 
that leads to convergence under dynamic topologies. Networks 
with switching topologies converge if they exhibit the same 
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invariant stationary value as defined in Lemma [3] and Lemma 4j 
which depend on ur s and v; Ij S for ( |F 1 1 and ( |P2[ ) respective y. 

We introduce now two classes of strongly connected, di¬ 
rected graphs for which the weighted Laplacian shares the 
left or the right eigenvector corresponding to q s = 0. For 
conservative networks, we have 

f/jp = {G(V,£,w) : rank L^(G) = n - 1, 

i (in) (G)xr R , s =0}. (26) 

For non-conservative networks, we define 

gjp = {G(V,£,w) : rank L^ out \G) = n - 1, 

» L , s xi M (G)=0}. (27) 

Correspondingly, we define the following sets of matrices with 
equal steady-state eigenvectors as follows 


Proof: The proof of this corollary builds on the Lyapunov 
function V(S) = l/25{t) T 5(t). The first derivative can be 
written as 

V(S(t)) = 1/2 (S(t) T Q T k 5(t) + 6(t) T Q k 5(t)) , (32) 

such that V(5(t)) < 0 due to the assumptions made. Using 
the Courant-Fischer theorem, and similar to G3’ the proof is 
concluded. Q 


Remark 2. A relevant question is if it is possible to verify 
if two matrices Q\ and Q 2 belong to the same set M$, 
x <E { |/ J 1 1 / J 2[ ). The common eigenvector problem can be 
solved when the eigenvalue is not known For the 

special case where the common eigenvalue is equal to zero, two 
matrices Q 1 and Q 2 have the same eigenvector corresponding 
to q s = 0 if 

null(Qi) = null(<2 2 ) • (33) 


M|D = {Q:Q = -L (in) (G), Q x vr, s = 0} 

= {Q : Q = -L (out) (G), u LjS x Q = 0} . (28) 

For these classes of networks, the system dynamics are de¬ 
scribed by 

S(t) = QkS(t) , (29) 

where Q k € M^H when (JPTJ is applied, and Q k € A/f-0 
when ( |P2| ) is applied. The index k changes over time according 
to k = /(f), /: R + —X IMi where Tm is the index set of the 
corresponding set of matrices. 

Lemma 5. Consider a network with network dynamics as 
described in m The networks that are strongly connected 
and follow 4ED or (|P2b are globally asymptotically convergent 
if Qk belongs to 3 or ik/jp), respectively. 

Proof: The proof follows from the fact that all networks 
that belong to M$-3 or M$-3 converge to the same invariant 
quantity, as they share the same right and left eigenvector, 
respectively. As all elements of ME I and are globally 

asymptotically stable, this concludes the proof. 0 

We introduce now the distance vector S = S(t ) — c s v r jS . 
As c s ur iS is an equilibrium of the system, we can write the 
distance dynamics as follows 

5(t) = Q k 5(t). (30) 

We cannot make a general statement about the definiteness 
of asymmetric matrices with non-positive eigenvalues. For the 
matrices belonging to Mjp or M$p that are negative definite, 
we can bound the convergence speed of the distance vector as 
follows. 

Corollary 3. Consider a strongly connected graph for which 
the weighted in-degree or out-degree Laplacian is negative 
definite. For these networks, the distance vector globally 
asymptotically converges to zero and the convergence speed 
can be bounded as 

P(f)|| 2 < II*5(0)|| 2 exp(g max f) (31) 

where q max = max <72 (A/JP) for conservative networks and 
<Zmax = max q- 2 ( A/jEZ) ) for non-conservative networks. Here, 
<72 is the non-zero eigenvalue with the smallest absolute value 
of the corresponding transition rate matrix. 


IV. Inhomogeneous equations: Stability and 

CONVERGENCE 

To model the exogenous addition and subtraction of property 
to multi-agent networks, we extend the homogeneous equation 
© to include an inhomogeneous term 

S{t) = QS{t)+U(t), (34) 

where U is an arbitrary input vector. Since the inhomogeneous 
equation 0 is linear in the exogenous input term, the 
exogenous input can either be constant over all sample paths or 
averaged over all sample paths. Both interpretations will result 
in identical instance-averaged inputs. For linear, time-invariant 
systems, the system response can be derived in closed form. 
Note that the solution to © with initial conditions of the 
network state 5(0) is given by Shit) = exp(Qf)5(0). It can 
be shown that the general solution to ( |34| is given by |25| 

S(t)=S h {t)+ [ exp(Q(f — T))Uir)dT. (35) 

Jo 

This can be interpreted as the sum of the solution to (|A]) and 
the convolution of the input U with the impulse response to 
( |34| for every node in the network. 

We now discuss several important use cases for the input 
vector U and show that many scenarios can be described 
as in ( |34| . In particular, we will demonstrate that networks 
containing stubborn agents G3 and networks with dynamic 
learning ID can both be expressed in the form of the 
inhomogeneous governing equation with constant and dynamic 
U respectively. This motivates the study of stability and 
convergence in the presence of exogenous inputs. 

A. Non-conservative networks with static inputs 

Adopting the definition in | fl5) , as well as the consensus 
update rule 0. a stubborn agent does not adjust its own 
value based on the values of neighboring nodes. This scenario 
is of interest to model opinion dynamics where a set of agents 
has constant opinion. In these networks, the out-degree of a 
stubborn agent is zero and the corresponding row of Q is a 
zero-row]/] For example, in the presence of two stubborn agents 

3 Conversely, if the in-degree of an agent equals zero, the node acts as a 
sink, which can receive property from every neighbor but does not contribute 
to his neighborhood. This case, where Q contains a zero column, does not 
lead to a convenient reduction of the state space. 






CHAN et al.: CHARACTERIZATION AND CONTROL OF DIFFUSION PROCESSES IN MULTI-AGENT NETWORKS 


7 


a i = 2 and 02 = n, the governing equation can be written as 
follows 


5r 

S n . 


2 1.1 2 l ,2 

0 0 

23.1 23,2 


0 0 

which can be reformulated as 


Ql,r 

0 

23,, 


5i 

Sri 


(36) 


' ■ 


CS L 

0 ? 

01 

1_ 

2] ,n—2 


1 

■ 

1 
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_2n_2,l Q'n-2,2 ' 

’ Q’n-2,n-2_ 


1 

10 

1_ 




, (37) 


where lik.a, and /i/,, „ 2 are the ai-th and 02 -th columns of Q 
with the ai-th and 02 -th entries omitted. The entries of these 
columns depend both on the network structure and the update 
profile, and we get 

S'(t) = Q'S'{t)+Bu, (38) 

where S' and Q' represent the reduced state space and reduced 
transition rate matrix, and u = [<S ai • • • <S a J T contains the 
static quantities of the stubborn agents. We notice from ( [38] ) 
that the presence of the stubborn agents in the network leads to 
a reduction of the state space and a formulation equivalent to 
( |34] > with constant but possibly different inputs for different 
stubborn nodes, and U = Bu. The reduced state space 
excludes the stubborn agents and the reduced transition rate 
matrix excludes the rows and columns corresponding to these 
agents. Note that the inhomogeneous equation ( [34] ) with static 
inputs corresponds to typical open-loop control systems. 

Since the input vector U is constant, it is known that 
bounded-input bounded-output (BIBO) stability of ( [38] ) is 
guaranteed iff all eigenvalues of Q' have negative real com¬ 
ponents. In addition, since Q 1 is constructed differently from 
Q, the convergence of the homogeneous and inhomogeneous 
systems is not identical. Define I st = {i: Sift) is constant} 
as the index set of stubborn agents. Then, we can write 

2zz = - Qij - 'Ej&jeu and the row sum 

of Q! satisfies 


Q'u + E &U ^ 0 ■ (39 > 

As a consequence, Q' does not have a zero eigenvalue in 
general and is nonsingular in general. In the following Lemma, 
we provide a condition that guarantees the non-singularity of 
2'. 

Lemma 6. In a network with stubborn agents, the transition 
rate matrix Q’ of the reduced state space is invertible if all 
nodes are connected to at least one stubborn agent. 

Proof: If each agent is connected to at least one stubborn 
agent, Q' is strictly diagonally dominant and we can write 

2;,- ,K (40 > 

jAi 


According to Gersgorin’s circle theorem, all eigenvalues of 
Q! reside in the complex plane within the union of the disks 
Di = {z e C : \z- 2-i| < E«i |2'j|}, 1 < i < ri with 
n' the dimension of Q!. As \Q' U \ > \Q'ij\ for a strictly 

diagonally dominant matrix, Q' cannot have a zero eigenvalue 
and is invertible. O 

From Lemma [6] we notice that for networks with stubborn 
agents the eigenvalues will on average be more concentrated 
around Q' u , which can accelerate the network dynamics. For 
the convergence of a BIBO-stable reduced system involving 
Q! with the conditions of Lemma [6] satisfied, we formulate 
the following Lemma. 

Lemma 7. Let Bu = b. If each agent is connected to at 
least one stubborn agent, then the network state converges to 
S'* = — Q! 1 b. We can then express the system dynamics as 
a homogeneous equation in the deviation of the property of a 
node from its steady state S A = S' — S'*: 

S' A (t) = Q'S' A (t). (41) 


Proof: The proof of this Lemma follows from traditional 
convergence results similar to those for the basic consensus 
algorithm 0, 03-11191 since the inhomogeneous equation 
in S' can be expressed as a homogeneous equation in S' A (t) 
and the poles of Q' lie in the open left-half complex plane. 

Global asymptotic stability can be demonstrated for sym¬ 
metric Q' by using the standard Lyapunov function V = 
S A S A . In this case, we obtain 

V = S A Q' t S a + S A Q'S' a <0 


since Q' and thus its transpose are negative definite. This 
demonstrates that the equilibrium point of the network is 
globally asymptotically stable, and the location of the point 
can be obtained by setting S' A = 0. 

The Lyapunov function above does not hold for nonsymmet- 
ric Q'. However, because Q' is strictly diagonally dominant, 
the existence of a Lyapunov function for Q' can be demon¬ 
strated to prove the global asymptotic stability of S' 133 ]. One 
such Lyapunov function is V = max,; -|, which proves the 


global asymptotic stability of S'. Jj 

Note that in the general case of conservative or non¬ 
conservative updating without stubborn agents, Q is con¬ 
structed such that it has a steady state eigenvalue q s = 0. As 
a consequence, Q is not invertible. If Q (or Q') is singular, 
then BIBO stability is no longer guaranteed, and the system 
is instead only marginally stable. 


Lemma 8. A non-trivially convergent network where q s = 0 
exists, diverges in infinite time in the presence of a constant 
input vector b. 

Proof: Consider the convolution integral in ( |35| - Since 
Q is not invertible, we perform the convolution integral by 
expanding the matrix exponential in a power series and per¬ 
forming an eigendecomposition of Q. Then, since the steady- 
state eigenvector does not decay with time, the integral does 
not converge in infinite time. □ 


B. Non-conservative networks with dynamic inputs 

In the case of dynamic learning, each node updates its own 
value based on the value of its neighboring nodes, as well 



















as the difference of its own value with a measurement of the 
system state (19) , which can be interpreted as a reference input 
for the node. This scenario is of interest to model tracking 
systems that are augmented with the information present in 
the neighborhood. This can be represented by the following 
update rule 

Sft+At) = Cij(Sj(t)—Si(t))+P(t)(Xj(t)—Si(t)), 

(42) 

where Xft) represents the measurement of node i at time t. 
If the measurement of each node is made at a rate p, then the 
corresponding governing equation is 

S(t) = (Q-0'I)S(t)+0'X(t), (43) 

where 0 ' = 0p, and where r,; ? = p in the construction 
of Qij. This is again reminiscent of ( |34| ) with time-varying 
U = 0'X(t). The inhomogeneous governing equation with dy¬ 
namic reference inputs corresponds directly to typical closed- 
loop proportional control systems with reference input X and 
proportional gain 0 ’. 

Analogous to the case of systems with stubborn agents 
above is the BIBO stability criteria for the system with 
dynamic learning and constant 0 ’: iff all eigenvalues of 
Qd = Q — 0'I have negative real components, then an input 
0'X(t) that is bounded for all time t > 0 will result in BIBO 
stability. In addition, we formulate the following Lemmas for 
the convergence of a BIBO stable dynamic learning system 
with constant 0'. 

Lemma 9. If Qd A invertible and lim^oo X (t) = X* 
for some real-valued constant X*, then the network state 
converges to S* = — Qd~ X 8 ' X*. 

Proof: The proof of this Lemma is analogous to the proof 
of Lemma □ a 

Remark 3. If Q corresponds to a CTMC and 8 ' > 0, then the 
eigenvalues of Qd are exactly 0 ’ less than the eigenvalues of 
Q, and the resultant system is guaranteed to be BIBO stable, 
provided 8 ' takes the same value for all agents. 

In the more general case where X(t) is not an independent 
measurement, but a function of Sa, we can formulate the 
following Lemma. 

Lemma 10. If Sa reaches equilibrium much faster than X, 
then we can approximate X{t) = H(X,Sa) as X(t ) ~ 
h(X, 0) for any arbitrary function h and regardless of whether 
X eventually converges. The time evolutions of Sa and X 
can then be approximated as independent even if they were 
originally coupled. 

Proof: This Lemma follows directly from the moving 

equilibrium theorem. Q 

C. Dynamic non-reference inputs 

Consider the following variation of ( |43] > 

S(t) = QS(t)+lA{t), (44) 

where U(t) is some arbitrary time-varying input vector and 

Q can be used to described a CTMC. The inhomogeneous 


governing equation with dynamic non-reference inputs cor¬ 
responds to typical open-loop control systems. Since Q is 
singular, the system is only marginally stable and not BIBO 
stable. A marginally stable system is known to give a bounded 
output if its input is an impulse of finite magnitude. By the 
principle of superposition, a continuously finite input that 
eventually decays to zero will also give a bounded output. 

D. Expanded state space 

Consider the following variation of ( |42) 

Si(t+At) = Sj(t)+y^ y Cjj(Sj(f)—Sj(t))+ 8 i{Xj(t)—Sj(t)) 

+ (45) 

where Y t (f) = Xft) and 'Tift) = Sift). This leads us to the 
coupled equations 

f(t) = S(t), (46) 

(1 + 0 ' 2 )S(t) = (Q - 0[l )S(t) - 0 ' 3 T{f) (47) 

+ 0 [x(t) + 0 , 2 x(t) + 0 , 3 Y(t), 

where 0[ = 8 i P, 82 = 82 P and 83 = 03 P- The inhomogeneous 
governing equation with an expanded state space corresponds 
to typical closed-loop proportional, integral and derivative 
(PID) control systems. Here, 81 - 03 and 82 correspond to P, 
I and D controls respectively. The expanded state space now 
includes the components of both S and T, and we can write 
this as a single relation 

(i+ 0' 2 )i f(t) + (8[i - o)f(t) + 8' 3 mt) 

= 0[Y(t)+0 , 2 Y(t) + 8 , 3 Y(t). (48) 

Performing a Laplace transform and then rearranging, we 
obtain 

f = (I3' 2 s 1 +I3' 1 s+Ii' s ) [(1 + + Oil - Q)s + ftl] , 

* (49) 

where the hat above a quantity denotes a Laplace-transformed 
quantity. The stability of this system is determined by the 
roots of the characteristic polynomial of the system, which 
can be obtained from the determinant of the matrix Qe = 
[(1 + 02 )Is 2 + (0'f — Q)s + 0' 3 l\. From standard state-space 
theory, if all the poles of the system, or the roots of the 
characteristic polynomial of Qe, lie in the open left-half 
complex plane, then the system is BIBO stable. If at least 
one of the poles lies on the complex axis, then the system 
is instead marginally stable. Depending on the values of 0f 
02 and 8 f the convergence of a BIBO stable system can be 
derived. For example, if 0 2 = 0 ' 3 = 0, we recover the results 
of Lemma [9j on the other hand, if 0 3 f 0, the final value 
theorem tells us that taking the limit s —> 0 of sS gives us the 
steady state value S.^ = X^. 

V. Control by exogenous excitation 

Whereas we studied in Section [TV] the convergence and sta¬ 
bility under exogenous inputs, we demonstrate in this section 
how the rates of property diffusion can be controlled through 
appropriate design of a generic time-varying input vector U(t). 
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A. Using inputs for targeted control 

We first establish a methodology for decomposing the 
existing system dynamics into a more convenient form for 
targeted control of individual system modes. We summarize 
this decomposition, as well as the subsequent reconstruction 
for implementation in real networks, in Algorithm [l] 


Algorithm 1 Implementing inputs for targeted control 

1: Transform basis of nodes to quasi-node space using char¬ 
acteristic eigenvectors as in ( |50| ) 

2: Perform a Laplace transform of the system to move into 
the frequency domain 

3: if implementation of external controllers is feasible then 

4: Implement controllers on one or more quasi-modes to 

achieve control objective 

5: else 

6: Design quasi-inputs by incorporating controller action 

on one quasi-mode 

7: end if 

8: Perform an inverse Laplace transform back into the time 
domain 

9: Transform quasi-nodes back to node space to obtain de¬ 
sired node inputs or system dynamics 




Fig. 7: Simplified block diagram for the inhomogeneous transformed govern¬ 
ing equation for a single quasi-mode with quasi-property si = qiSi + Ui. 
The tilde above a quantity and its lower case denote a quantity in quasi-node 
space. 


After applying the basis transformation, we can tune the 
quasi-modes individually to achieve the desired control objec¬ 
tive using one of the two following approaches. We can either 
design a controller acting on one or more of the quasi-modes 
to steer the system response, or modify the quasi input such 
that it incorporates the controller action. In the first approach, 
we design a closed-loop feedback control block to adjust the 
system response to different input signals. These include the 
implementation of filters and lead-lag compensators to modify 
the frequency and phase responses. Conversely, we can design 
the quasi-input v! i = u'(v,i) for all quasi-modes i in order 
to bring the quasi-outputs to their desired values. This open- 
loop approach could be convenient in scenarios where perfect 
measurements of the system state are difficult to make and the 
effects of system noise are not significant. We illustrate this 
with a simple example in Fig. 8] Suppose we wish to mimic 
proportional or integral closed- oop control on the quasi-input 
u y corresponding to quasi-mode y using a feedback block F 
with transfer function I\ or K/s respectively, where K £ R. 
We then subsume the control loop into a quasi-input using the 
Laplace-transformed relation 


Fig. 5: Block diagram for the inhomogeneous equation E}- The hat above a 
quantity denotes a Laplace-transformed quantity. 

Figure [5] demonstrates that ( [34] ) together with initial con¬ 
ditions <S(0) can be expressed as a block diagram with a 
feedback loop. It is possible to separate the effects of the 
initial conditions and the external input, which is illustrated 
in Fig. [6] for clarity. In what follows, we study the effects of 
modifying the lower branch in Fig. [6] related to the exogenous 
input without influencing the upper branch. We consider here 


u'(Ui)=Ui ----y— , (51) 

s - q y + b 

which makes the two block diagrams in Fig. [8] equivalent. 
Note that the controller design here is only performed on a 
single quasi-mode y. Yet, the quasi-inputs of all the other 
quasi-modes must be transformed using since all the 

quasi-modes are subject to the same external input. The main 
advantage of the proposed method lies in the possibility to 
isolate parts of the system and separately adjust the response 
times of the different modes without a physical controller. 



Fig. 6: Simplified block diagram for |34| . 

the case where the system has exactly n distinct eigenvalues 
and eigenvectors. We use the decomposition in ( |23j ) and rewrite 
( f34[ ) as 

A~ l S = KA~ l S + A^U , (50) 

which corresponds to performing a basis transformation by 
taking linear combinations of the various nodes. Under this 
transformation, we obtain quasi-modes with quasi-property Si, 
as well as corresponding quasi-inputs Ui that are each acted 
upon by a single eigenvalue q t of Q. The time evolution for 
this transformed system is illustrated in Fig. [7] 



Fig. 8: Block diagrams for quasi-mode y for the original control loop (left) and 
when the loop is subsumed into the Laplace-transformed quasi-input u'(uf 
(right). 

Remark 4. Note that it is possible to account for the case 
where some of the n eigenvalues are degenerate as long 
as Q is diagonalizable. In the event that the algebraic and 
geometric multiplicities of each of the degenerate eigenvalues 
are identical, we can use the same transformation as above, 
except that some of the quasi-modes corresponding to the 
same eigenvalue, yet different eigenvectors, will have the same 
transfer function l/(s — qf). The situation becomes more 
sensitive when Q is defective. In this case, we have to consider 
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the generalized eigenspace of the matrix instead. Using the 
generalized eigenvector decomposition of Q, we obtain a 
complete basis of n orthogonal generalized eigenvectors, and 
Q can be diagonalized in the block Jordan form. Suppose 
we have a Jordan block involving nodes z and z + 1. Then, 

l 2 = g 2 l 2 + f*+1 + u z . 

B. Case study for external control 


for this particular case study. Figure [TO] illustrates the block 
diagrams for these two quasi-modes in the s-domain, along 
with possible closed-loop control blocks ll\ and U\. 



l 



Fig. 9: Asymmetric cycle graph C 4 , a . 


To illustrate how external control on a single quasi-mode 
can be implemented by quasi-input design, we consider a 
simple case study of four nodes in an asymmetric cycle graph 
configuration, as shown in Fig. [9] In this case, the eigenvalues 
of the associated Q are {<71, <72, (73, (74} = {—3, —2, —1, 0}, and 
there are no degeneracies. 

Under the corresponding unit right eigenvec¬ 

tors are A [-11-11], J|[-ioio], wA[o-iot], and 


— y § [ | 1 | 1 ] respectively. Based on the unit steady-state 
eigenvector, we observe that the network on average tends 
to accumulate property at nodes 2 and 4 while depleting the 
property of the other nodes. This motivates us to consider 
the control objective of maintaining property at nodes 1 
and 3 during the transient period to reverse the steady-state 
asymmetry. 

Under the corresponding unit right eigenvectors 

are 3- 1 3]- V^ 10-10 ]* V^°' 10l] and 

A [ 1 1 11 ] respectively. Based on the network structure, we 
observe that the network allocates higher weights to the 
quantities at nodes 2 and 4, which affects the calculation of 
the final consensus value. This motivates us to consider the 
control objective of increasing the weight of nodes 1 and 3 
during the transient period to increase their influence on the 
system. 

For either update protocol, it is sensible to excite nodes 1 
and 3 in order to achieve the control objective. Since each 
quasi-mode is associated with a single pole, the net effect of 
the quasi-mode plant resembles the integration of the input, so 
we inject a unit impulse S(t) into nodes 1 and 3 at t = 0, and 
then apply control to displace the poles and achieve step-like 
outputs. The injection is equivalent to a Laplace-transformed 
input U = [1010]. By performing the aforementioned basis 
transformation on U , we obtain the equivalent quasi-inputs 
{ui,u 2 ,U3,U4} = {— §, 0, 0, —2y^j|r} for the conservative 

network and {€ti, u 2 , u 3, £4} = {—2y^AA, 0,0, |} for the non¬ 
conservative network. Note that quasi-modes 2 and 3 generate 
zero quasi-inputs in both cases due to the asymmetrically 
designed input [1010]. Hence, from here on, we will only 
consider the quasi-modes 1 and 4 with non-trivial quasi-inputs 


Fig. 10: Block d iag rams for quasi-modes 1 (left) and 4 (right) of the system 
illustrated in Fig.M with corresponding Laplace-transformed quasi-outputs si 
and S 4 . The controllers H\ and H 4 represent a feedback control block. 


1) No control: In the absence of controllers (Hi = H,\ = 0), 
we obtain the following results: 

Under ( |P1| >, we obtain the quasi-outputs Si = — |e -34 

and S4 — ^ After transforming the quasi-outputs back 

into node space, we obtain = <§3 = |(1 + 2e~ 3t ) 
for the instance-averaged properties of nodes 1 and 3, and 
^2 = N4 = |(1 — e~ 3t ) for the instance-averaged properties 
of nodes 2 and 4. 

Under ( |P2| >, we obtain the quasi-outputs si = —2 y^^e -34 
and S4 = |, which then correspond to = S 3 = 1(1 + 
2e“ 34 ), and S 2 = S& = |(1 — e -3t ). 

In both cases, the settling time of the outputs to their steady- 
state values is 1/3. We can adjust this settling time using 
proportional control. 

2 ) Proportional control: We perform proportional control 
on quasi-mode 1 by replacing Hi with a fixed constant. We 
illustrate the resultant outputs for some constants Hi in Figures 
lllal and |llbl for the conservative and non-conservative cases. 
By decreasing Hi towards —3, we can delay the onset of the 
steady state, but we also increase the influence of the inputs 
on the system by increasing the total system input. 

3) Integral control: In standard control theory, proportional 
control is associated with a steady-state error relative to the 
input. This causes the outputs to be maintained at a non-zero 
level long after the impulses have ended. Integral control elim¬ 
inates this error and further forces the steady outputs to zero at 
infinite time. We verify this possibly useful behavior in Figures 
I12al and I12bl for the conservative and non-conservative cases 
by performing integral control on quasi-mode 1. The transient 
actions neither influence the total steady-state property in a 
conservative system nor the steady-state consensus value in 
a non-conservative system. However, the magnitude of the 
integral gain can influence the speed of onset of the steady 
state. 

4) Conclusions on external control: We have demonstrated 
that by varying the input functions appropriately, both the 
transient and steady-state perturbations to the system can be 
modified. By careful design cognizant of the characteristic 
timescales of the system, the timescales of the perturbations 
can be varied. In addition, the perturbations can be designed 
with either permanent or temporary effects on the system 
depending on the objectives of the network designer. 


C. Network classification 

Section V-B illustrated that both the external input and 
the structure of the quasi-modes in the system influence the 
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(a) Erdos-Renyi (random graph) (b) Albert-Barabasi (scale-free) (c) Watts-Strogatz (small-world) 

Fig. 13: Histograms of Fiedler vector entries for a single instance of various random networks with 400 nodes using the CONTEST toolbox |34[ . 




Unit Fiedler vector entry 


Unit Fiedler vector entry 


Unit Fiedler vector entry 




(a) Conservative protocol 


(b) Non-conservative protocol 


Fig. 11: Resultant outputs for different H\ for the two network update 
protocols by mimicking proportional control on quasi-mode 1. The solid lines 
indicate the time evolution of properties at nodes 1 and 3, while the dashed 
lines indicate the time evolution of properties at nodes 2 and 4. 


in Fig. 13 to illustrate their varying distributions. We observe 
that the spread of the Fiedler vector entries increases with 
the average network clustering coefficient. Since small-world 
networks tend to have Fiedler vector entries with a larger 
spread, inputs to small-world networks can affect different 
nodes to different extents. The input design considerations for 
small-world networks have to ensure that specific nodes are 
targeted to achieve the desired control objective. 


VI. Control by structure modification 

Instead of controlling the network dynamics by means of 
exogenous excitation, we present in this section two methods 
to steer the dynamics by modifications in the network structure. 




(a) Conservative protocol 


(b) Non-conservative protocol 


Fig. 12: Resultant outputs for different H\ for the two network update 
protocols by mimicking integral control on quasi-mode 1. The solid lines 
indicate the time evolution of properties at nodes 1 and 3, while the dashed 
lines indicate the time evolution of properties at nodes 2 and 4. 


resultant system outputs. The variance in the individual entries 
of the eigenvectors corresponding to the activated quasi-modes 
influence the effects of the external input on the individual 
nodes. For example, an external input on a single node in a 
network where the activated quasi-modes have low variances in 
their eigenvector entries tends to influence all other connected 
nodes to the same extent; conversely, the same input in a 
network with quasi-modes with large variances may have 
different effects on different nodes. 

We provide a simple example here to illustrate this point. 
It is generally accepted that the eigenvector associated with 
the non-zero eigenvalue of the Laplacian with the smallest 
absolute value, the Fiedler vector, provides a reasonable basis 
for spectral graph partitioning (35) . We plot histograms of the 
entries of the Fiedler vector for different classes of networks 


A. Network structure modification through design 

Instead of controlling the system with an exogenous input 
term, we can also modify the network structure directly to 
achieve control. Modifying the network structure may be 
preferable to introducing external excitations in systems where 
it is more convenient to modify the interactions between agents 
than to introduce or remove property from agents. Through this 
modification, we can shift both the eigenvalues and the eigen¬ 
vector components, thereby changing the system dynamics as 
well as its steady-state distribution. In particular, modifying 
the network links causes changes in both the transformation 
basis between node space and quasi-node space, as well as the 
characteristic timescales of each system mode. 

By modifying the links more carefully, we can achieve 
modification of the characteristic timescales without reshaping 
the transformation basis. Recall that Q = A A A _ 3 if it is 
diagonalizable. Apart from decomposing Q into the matrices 
A, A and A -1 , we can also reconstruct Q by multiplying A , 
A and A -1 together in the appropriate order. By holding A 
constant and modifying the entries of A, we can achieve a 
modification in Q without affecting the composition of the 
system modes. The stability of the new eigenvalues of the 
modified Q will determine if the modified Q continues to 
represent a CTMC described by (Ja). In this respect, as long 
as q s = 0 is preserved, the modification of Q will result in a 
CTMC where the columns or rows sum to zero for conservative 
and non-conservative networks, respectively. 

Remark 5. We note that this approach of network structure 
modification through design can be extended to more com¬ 
plex systems. In particular, consider the distributed control 
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Fig. 14: Star graph S$ with symmetric links of unit weights. 



Fig. 15: The superposition of these two graphs yields the network structure 
corresponding to the modified transition rate matrix given in (53). 


system & 


T = Q(P' d T + p'pT + P'jT ), 


(52) 


for some arbitrary gains /3' D , j3' p and f3' P Here, modifications 
to Q will also influence the dynamics of the system. However, 
the key governing parameters are no longer the eigenvalues 
Qk of Q, but instead the roots of the characteristic polynomial 
of the matrix [s 2 (I — fl' D Q) — s{fl' P Q) — fljQ]. 

To illustrate how we can modify the eigenvalues of a 
network to achieve a shift in the system response, we consider 
a symmetric star network depicted in Fig. 
rate matrix Q corresponding to this network 


14 The transition 


k; is 


Q = 


-4 1 1 1 1 ' 

1-10 0 0 
1 0-10 0 
1 0 0-10 
1000-1 


The eigenvalues of this matrix are 0 with multiplicity 1, 
—5 with multiplicity 1 and —1 with multiplicity 3. The 
corresponding eigenvectors are the steady-state eigenvector 
describing equilibration between all the nodes ([i t t t t ]), 
an eigenvector describing diffusion between the peripheral 
nodes and the central node ([-4 t t t t ]), and eigenvectors 
describing diffusion among the peripheral nodes ([o t t t -3], 
[o 0.36 l -1.36 o] and [o -1.36 i 0.36 o]). Suppose we want to 
slow down diffusion between the peripheral nodes and the 
central node. We can do so by decreasing the effective update 
rate of this mode, which causes the relative importance of 
diffusion among the peripheral nodes to increase and results in 
extra links facilitating this diffusion. For instance, by changing 
the effective update rate of the peripheral-central mode from 
5 to 4.5, we obtain the following transition rate matrix 


Q = 


'-3.6 

0.9 

0.9 

0.9 

0.9 

0.9 

-0.975 

0.025 

0.025 

0.025 

0.9 

0.025 

-0.975 

0.025 

0.025 

0.9 

0.025 

0.025 

-0.975 

0.025 

0.9 

0.025 

0.025 

0.025 

-0.975 


(53) 


In the resultant architecture, the links between the central node 
and the peripheral nodes have weights of 0.9 instead of 1, and 
the links between the peripheral nodes have weights of 0.025 
instead of 0. The corresponding network is a superposition of 


the two graphs depicted in Fig. 15 


B. Network structure modification through adaptive control 

In sufficiently large systems, it can be computationally 
intensive to perform the eigendecomposition of various Q 
matrices in order to determine the network structure that gives 
the most desirable system response. There are two possible 
approaches to handle this issue. Firstly, one can choose from a 


set of matrices that deviate from the original Q by small per¬ 
turbations 5Q. The eigenvalue and eigenvector perturbations 
can be derived as a function of SQ for sufficiently small 6 Q. 
However, this limits us to small modifications in the network 
structure that do not span the feasible action space. 

It is possible to examine larger and/or more modifications 
without going through the computational complexity by per¬ 
forming adaptive control driven by reinforcement learning. In 
this approach, we formulate the problem as an MDP. The state 
space X of the MDP is equivalent to the node space of the 
network, while the action space W of the MDP is equivalent 
to the set of possible Q that the network can adopt. More 
formally, X = V, while W = {Qi, ■ ■ ■, Q w *} for w* different 
actions. In essence, the MDP is a direct extension of the CTMC 
upon which the governing equation of the network diffusion is 
based Q By selecting a reward function R(X, W) that describes 
the desirability of action W given that the system is in state 
X, and by scheduling decisions to be made immediately after 
a change in the state of the systerrj^] the MDP is completely 
defined. We can determine the conditions for optimality in the 
MDP by searching the policy that yields the optimal expected 
reward integrated over time 


E 


X(0),VF(0), sample paths 


7* R(X{t),W{t))dt 


(54) 


where 7 £ (0,1) denotes a discount rate. By considering 
© and ( [54} simultaneously, we obtain the Hamilton-Jacobi- 
Bellman equation for the system. In optimal control theory, 
this equation is solved to obtain the optimal policy for the 
system. For large state and/or action spaces, this solution can 
be difficult to obtain. For a more computationally manage¬ 
able approach, we turn to reinforcement learning to deter¬ 
mine a suboptimal solution with sufficiently fast convergence. 
For each state-action pair ( X , W), we store a quality value 
V^(X,W), and update V^(X, W) based on the Q-learning 
method {26) 


s+1 


(X,W) = (l-p)V k Q (X,W) 




where k is the index indicating the /c-th change of the system 
state, p £ (0,1] is the learning rate of the algorithm, and A' next 
is the next state of the system. Whenever the system transitions 
to a new state, a new action has to be selected. We set up the 
system such that with probability e, the system selects the 
action with the highest quality given its current state, while 
with probability 1 — e, the system selects a random action. 


4 A MDP with a single action is equivalent to a CTMC. 

"This means that the variation of the active action W(t) with time is a 
piecewise constant function. 
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Fig. 16: Time evolution of quality I/®(1, W) of node 1 for 5 different actions 
W for a single trial. 



Fig. 17: Stationary probability distribution of the simulated network averaged 
over 100 trials. The nodes marked red are the two randomly selected nodes 
(12 and 14) that were assigned a reward on every visit by each MDP. 


This is known as the e-greedy policy where e is the exploitation 
probability of the system. By selecting appropriate values of /i, 
7 and e, as well as an appropriate reward function R(X. W), 
we can design the learning process to achieve predefined 
system objectives at a suitable convergence rate. 

To illustrate the use of an MDP to select a desirable Q 
in a large network, we set up a state space X containing 20 
agents, and we set up the action space W by generating 400 
Q matrices corresponding to 400 complete graphs whose links 
each have weights sampled from a uniform distribution on 
[0,1], to simulate 400 different random network configurations. 
These matrices obey the conservative relation m which 
obeys the update rule In addition, we select 2 arbitrary 
nodes (12 and 14) out of the 20 to serve as target nodes for our 
system. We let the state transitions of the MDP be governed 
by a grand system transition matrix Q g . Whenever a state 
transition occurs, we assign a reward of 5 whenever the state 
coincides with 1 of these 2 target nodes, and a reward of 0 
otherwise, in order to encourage the system to visit these states 
more often. The MDP then selects an action, following which 
Q g is updated based on this selected action. For example, if 
the updated state of the system is node 5 and action 20 is 
selected, then the 5th column of Q g is replaced by the 5th 
column of Q 2 o corresponding to the 20th action. 

For this work, we ran 100 independent numerical simula¬ 
tions describing the time evolution of 100 MDPs with the same 
state and action spaces, but with different initial states and 


Q g . Using the reinforcement learning parameters // = 0.2, 
e = 0.4 and 7 = 0.995, we run each simulation for 5E6 time- 
steps, following which we find the steady-state eigenvector 
(normalized to sum to 1), or stationary distribution, Vo for 
each last known Q g such that Q g v 0 = 0 and JA vq( i) = 1. In 
Fig. 16 we demonstrate using the time evolution of V®(1, W) 
for 5aifferent actions W that the learning curve has reached 
convergence. In Fig. ED we plot the last known stationary 
distribution vq (i) over all the states i averaged over the 100 
trials, and demonstrate that the MDP is, on average, able to 
modify the stationary distribution of the system after sufficient 
time has elapsed by careful design of the reward function. 

In this case, we guided the system to favor nodes 12 and 
14 by modifying the network structure, analogous to how we 
modified the system steady state in Section V-B through the 
injection of appropriate external inputs. This MDP methodol¬ 
ogy can either be used online to allow the network to respond 
to time-varying objectives, or offline to cycle through various 
network possibilities with reduced computational complexity. 


VII. Conclusion 

In this work, we proposed a probabilistic framework that 
describes diffusion in multi-agent networks and allows for 
a variety of inter-agent update rules. We focused on two 
protocols where the total network quantity is conserved (con¬ 
servative protocol) or variable (non-conservative protocol). By 
including the possibility of asymmetric updates, non-unitary 
links and switching topologies, we examined the stability and 
convergence characteristics of generic networks, and described 
their transient and steady-state dynamics in detail. In addition, 
we modeled external influences on networks by considering an 
inhomogeneous form of the governing equation. As such, the 
framework has the potential to capture the presence of stubborn 
agents and the process of dynamic learning. Furthermore, we 
demonstrated the ability to achieve control of diffusion in these 
networks using either external excitation or modification of the 
network structure. For the former, we showed that the form of 
the external excitation and the network structure are crucial to 
the characteristics of the controlled output, and for the latter, 
we identified an algorithm involving reinforcement learning 
for an MDP to promote quick achievement of the control 
objective. Through these techniques, we enable the possibility 
of actuating individual nodes and steering the system steady 
state towards some desired control objective, thereby allowing 
for the micromanagement of diffusion in networks. 

Our future work to extend the current toolbox will include 
more direct applications to large random networks. In addition, 
we intend to include time-inhomogeneous and stochastic gov¬ 
erning equations, and to extend the framework to continuous 
node spaces. Another important extension of this work is to 
develop more insight in the higher order moments of the 
property. 
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